Indonesian-English Cross Language Question Answering
Abstract
Our Indonesian-English Cross Language Question Answering (CLQA) system is divided into four components: a question analyzer, a keyword translator, a passage retriever, and an answer finder. The Indonesian question is passed to the question analyzer, which yields an Indonesian keyword list, the Indonesian question focus, and the question class. We determine the question class with an SVM classifier implemented in Weka[15]. Because Indonesian is a language with limited data resources, we use a bigram frequency feature as an additional feature for question classification. The Indonesian keywords are translated into English using an Indonesian-English bilingual dictionary. The English translations are composed into a boolean query to retrieve relevant passages. We select the passages whose scores are within the three highest IDF scores. In the answer finder, the answer is located using an SVM-based text chunking method implemented in Yamcha[4]. Unlike other Indonesian-English CLQA systems[1,14], we do not tag named entities in the target documents; instead, we only POS-tag the target documents using TreeTagger[12]. Based on our experiments in Indonesian QA, we use the question class, question features, and document features for the machine-learning-based answer finder. We also add a WordNet distance feature to the document features. Using 284 questions as test data, we achieved about 31.69% accuracy on the top 5 answers, which is better than other Indonesian-English CLQA systems.
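The query composition and passage selection steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact query form or scoring formula, so the following are assumptions made only for illustration: translations of the same keyword are joined with OR, keywords are joined with AND, a passage is scored by summing the IDF of the query terms it contains, and passages are kept when their score is among the three highest distinct scores. The function names (`build_boolean_query`, `idf_table`, `select_passages`) are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def build_boolean_query(translations):
    """Compose a boolean query from per-keyword translation lists.

    Assumption: alternatives for one keyword are OR-ed, keywords are AND-ed.
    """
    groups = ["(" + " OR ".join(sorted(set(cands))) + ")"
              for cands in translations if cands]
    return " AND ".join(groups)


def idf_table(passages):
    """Compute log(N / df) for every token over the passage collection."""
    n = len(passages)
    df = Counter()
    for passage in passages:
        df.update(set(passage.lower().split()))
    return {term: math.log(n / count) for term, count in df.items()}


def select_passages(passages, translations, top_scores=3):
    """Keep passages whose summed-IDF score is among the top distinct scores.

    Assumption: the score is the sum of IDF values of matched query terms.
    """
    terms = sorted({t.lower() for cands in translations for t in cands})
    idf = idf_table(passages)
    scored = []
    for passage in passages:
        tokens = set(passage.lower().split())
        score = sum(idf.get(t, 0.0) for t in terms if t in tokens)
        scored.append((score, passage))
    # Distinct positive scores, highest first, truncated to `top_scores`.
    kept = sorted({s for s, _ in scored if s > 0}, reverse=True)[:top_scores]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [p for s, p in ranked if s in kept]


if __name__ == "__main__":
    # Hypothetical dictionary output for two Indonesian keywords.
    translations = [["president", "chairman"], ["indonesia"]]
    passages = [
        "the president of indonesia visited tokyo",
        "indonesia exports coffee",
        "the weather is mild",
        "a chairman resigned",
    ]
    print(build_boolean_query(translations))
    for passage in select_passages(passages, translations):
        print(passage)
```

In this sketch a passage that matches no translated keyword gets a score of zero and is never selected, whatever the value of `top_scores`.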
Similar resources
A Machine Learning Approach for an Indonesian-English Cross Language Question Answering System
We have built a CLQA (Cross Language Question Answering) system for a source language with limited data resources (e.g. Indonesian) using a machine learning approach. The CLQA system consists of four modules: question analyzer, keyword translator, passage retriever and answer finder. We used machine learning in two modules, the question classifier (part of the question analyzer) and the answer ...
University of Indonesia's Participation in Question Answering at CLEF 2005
We present a report on our participation in the Indonesian-English question-answering task of the 2005 Cross-Language Evaluation Forum (CLEF). We chose to translate an Indonesian query set into English using a commercial machine translation tool called Transtool. We used a linguistic tool, the Monty Tagger, to find the answer to the question in a passage that has the same tagging as the query.
Finding Answers to Indonesian Questions from English Documents
Our report describes the results of our participation in the Indonesian-English question-answering task of the 2006 Cross-Language Evaluation Forum (CLEF). In this work we translated an Indonesian query set into English using a machine translation tool available on the internet. Documents relevant to a question are first retrieved. The relevant documents are then divided into passages of...
Perspectives on Chinese Question Answering Systems
Question Answering (QA) is becoming an increasingly important research area in natural language processing. Since 1999, many international question answering contests have been held at conferences and workshops, such as TREC, CLEF, and NTCIR. Thus far, eleven languages – Bulgarian, Dutch, English, Finnish, French, German, Indonesian, Italian, Japanese, Portuguese, and Spanish – have been tested...
Heuristic and Syntactic Scoring for Cross-language Question Answering
This paper describes the Marsha Cross-Language Question Answering System used by Mount Holyoke College in the English-Chinese, Chinese-Chinese, and English-English subtasks of the NTCIR Cross-Language Question Answering task. The system was most effective in the Chinese and English monolingual tasks. However, improved translations and better query type identification remain challenges for more ...
Publication date: 2007